Smart Paper Notes

Markowitz mean-variance portfolios that use the sample mean and covariance as input parameters suffer from numerous issues in practice. They perform poorly out of sample because of estimation error, and they exhibit extreme weights together with high sensitivity to changes in the input parameters. The heavy-tailed nature of financial time series is in fact the cause of these erratic weight fluctuations, which in turn create substantial transaction costs. To robustify the weights, we present a toolbox for stabilizing costs and weights for global minimum Markowitz portfolios. Using a projected gradient descent (PGD) technique, we avoid estimating and inverting the covariance operator as a whole and concentrate on robust estimation of the gradient descent increment. Using modern tools of robust statistics, we construct a computationally efficient estimator, based on median-of-means, that has almost Gaussian properties uniformly over weights. This robustified Markowitz approach is confirmed by empirical studies on equity markets. We demonstrate that robustified portfolios achieve the lowest turnover compared to shrinkage-based and constrained portfolios while preserving or slightly improving out-of-sample performance.
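The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea it describes: projected gradient descent for a minimum-variance portfolio in which each gradient increment is estimated by a coordinate-wise median-of-means instead of through a full covariance matrix. The function names, the median centering, the budget-only constraint (weights sum to one, shorting allowed), the block count, and the step-size rule are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def mom_gradient(returns, w, n_blocks):
    """Coordinate-wise median-of-means estimate of Sigma @ w.

    Sigma @ w equals E[x (x . w)] for centered returns x, so it can be
    estimated from per-observation terms without forming Sigma itself.
    """
    # Robust centering by the per-asset median (an assumption of this sketch).
    centered = returns - np.median(returns, axis=0)
    # One vector x_t * (x_t . w) per observation, shape (T, n_assets).
    per_obs = centered * (centered @ w)[:, None]
    # Average within consecutive blocks, then take the median across blocks.
    blocks = np.array_split(per_obs, n_blocks)
    block_means = np.stack([b.mean(axis=0) for b in blocks])
    return np.median(block_means, axis=0)


def project_budget(w):
    """Euclidean projection onto the hyperplane {w : sum(w) = 1}."""
    return w - (w.sum() - 1.0) / w.size


def robust_min_variance(returns, n_blocks=10, n_iter=500):
    """Projected gradient descent on w' Sigma w / 2 with robust increments."""
    n_assets = returns.shape[1]
    w = np.full(n_assets, 1.0 / n_assets)  # start from equal weights
    # Conservative step: the sum of per-asset variances (the trace of the
    # sample covariance) bounds its largest eigenvalue from above.
    step = 1.0 / returns.var(axis=0).sum()
    for _ in range(n_iter):
        w = project_budget(w - step * mom_gradient(returns, w, n_blocks))
    return w
```

Calling `robust_min_variance(R)` on a `(T, n_assets)` array of returns `R` yields weights that sum to one. Replacing `project_budget` with a projection onto a different constraint set (for example, one that also bounds the weights) would change the portfolio class without touching the gradient estimator.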


Multidimensional Assignment Problem for multipartite entity resolution

Alla Kammerdiner, Alexander Semenov, Eduardo Pasiliao

Category: Artificial Intelligence

2021-12-06

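Multipartite entity resolution aims to integrate records from multiple datasets into a single entity. We derive a mathematical formulation of the general record linkage problem in multipartite entity resolution across many datasets as a combinatorial optimization problem known as the multidimensional assignment problem. To motivate our approach, we illustrate the advantage of multipartite entity resolution over sequential bipartite matching. Because the optimization problem is NP-hard, we apply two heuristic procedures, a greedy algorithm and very large-scale neighborhood search, to solve the assignment problem and find the most likely matching of records from multiple datasets to a single entity. We evaluate and compare the performance of these algorithms and their modifications on synthetically generated data. We perform computational experiments to compare the performance of a recent heuristic, very large-scale neighborhood search, with a greedy algorithm, another heuristic for the multidimensional assignment problem (MAP), and with two versions of a genetic algorithm, a general metaheuristic. Importantly, we perform experiments to compare two alternative methods of restarting the search for the former heuristic, namely a random-sampling multi-start and a deterministic design-based multi-start. We find evidence that the design-based multi-start can be more efficient as the size of the databases grows. In addition, we show that very large-scale neighborhood search, especially its multi-start version, outperforms the simple greedy heuristic. Hybridizing greedy search with very large-scale neighborhood search improves performance. Using multi-start with several additional runs of very large-scale neighborhood search offers some improvement in the performance of the procedure. Finally, we propose an approach to evaluating the complexity of very large-scale neighborhood search.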
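As an illustration of the simpler of the two heuristics named in the abstract, here is a minimal sketch of a greedy procedure for an axial multidimensional assignment problem. It assumes a dense cost tensor with one axis per dataset, the same number of records in every dataset, and finite costs where a lower cost means a more likely match. The function name `greedy_map` and this cost-tensor representation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def greedy_map(cost):
    """Greedy heuristic for an axial multidimensional assignment problem.

    cost[i1, ..., iD] is the (finite) cost of linking record i1 of dataset 1
    with record i2 of dataset 2, and so on. Each record is used exactly once.
    Returns the chosen index tuples and their total cost.
    """
    cost = np.array(cost, dtype=float)  # work on a copy
    n = cost.shape[0]
    if any(size != n for size in cost.shape):
        raise ValueError("all datasets must contain the same number of records")

    chosen, total = [], 0.0
    for _ in range(n):
        # Pick the cheapest tuple among those still available.
        idx = np.unravel_index(np.argmin(cost), cost.shape)
        chosen.append(tuple(int(i) for i in idx))
        total += float(cost[idx])
        # Block every tuple that reuses one of the records just matched.
        for axis, i in enumerate(idx):
            block = [slice(None)] * cost.ndim
            block[axis] = i
            cost[tuple(block)] = np.inf
    return chosen, total
```

Because each step commits to the locally cheapest tuple, the result can be far from optimal. That is consistent with the abstract's finding that neighborhood search, alone or hybridized with greedy search, outperforms the simple greedy heuristic.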
